{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) Microsoft Corporation. All rights reserved.  \n",
    "Licensed under the MIT License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Mobilenet v2 Quantization with ONNX Runtime on CPU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial, we will load a mobilenet v2 model pretrained with [PyTorch](https://pytorch.org/), export the model to ONNX, and quantize then run with ONNXRuntime."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Prerequisites ##\n",
    "\n",
    "If you have Jupyter Notebook, you can run this notebook directly with it. You may need to install or upgrade [PyTorch](https://pytorch.org/), [OnnxRuntime](https://microsoft.github.io/onnxruntime/), and other required packages.\n",
    "\n",
    "Otherwise, you can setup a new environment. First, install [Anaconda](https://www.anaconda.com/distribution/). Then open an AnaConda prompt window and run the following commands:\n",
    "\n",
    "```console\n",
    "conda create -n cpu_env python=3.8\n",
    "conda activate cpu_env\n",
    "conda install jupyter\n",
    "jupyter notebook\n",
    "```\n",
    "The last command will launch Jupyter Notebook and we can open this notebook in browser to continue."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 0.1 Install packages\n",
    "Let's install the necessary packages to start the tutorial. We will install PyTorch 1.8, OnnxRuntime 1.8, latest ONNX and pillow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Install or upgrade PyTorch 1.8.0 and OnnxRuntime 1.8 for CPU-only.\n",
    "import sys\n",
    "!{sys.executable} -m pip install --upgrade torch==1.8.0 torchvision==0.9.0 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html\n",
    "!{sys.executable} -m pip install --upgrade onnxruntime==1.8.0\n",
    "!{sys.executable} -m pip install --upgrade onnx\n",
    "!{sys.executable} -m pip install --upgrade pillow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1 Download pretrained model and export to ONNX"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this step, we load a pretrained mobilenet v2 model, and export it to ONNX."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.1 Load the pretrained model\n",
    "Use torchvision provides API to load mobilenet_v2 model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from torchvision import models, datasets, transforms as T\n",
    "mobilenet_v2 = models.mobilenet_v2(pretrained=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 Export the model to ONNX\n",
    "Pytorch onnx export API to export the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "image_height = 224\n",
    "image_width = 224\n",
    "x = torch.randn(1, 3, image_height, image_width, requires_grad=True)\n",
    "torch_out = mobilenet_v2(x)\n",
    "\n",
    "# Export the model\n",
    "torch.onnx.export(mobilenet_v2,              # model being run\n",
    "                  x,                         # model input (or a tuple for multiple inputs)\n",
    "                  \"mobilenet_v2_float.onnx\", # where to save the model (can be a file or file-like object)\n",
    "                  export_params=True,        # store the trained parameter weights inside the model file\n",
    "                  opset_version=12,          # the ONNX version to export the model to\n",
    "                  do_constant_folding=True,  # whether to execute constant folding for optimization\n",
    "                  input_names = ['input'],   # the model's input names\n",
    "                  output_names = ['output']) # the model's output names\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.3 Sample Execution with ONNXRuntime"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run an sample with the full precision ONNX model. Firstly, implement the preprocess."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from PIL import Image\n",
    "import numpy as np\n",
    "import onnxruntime\n",
    "import torch\n",
    "\n",
    "def preprocess_image(image_path, height, width, channels=3):\n",
    "    image = Image.open(image_path)\n",
    "    image = image.resize((width, height), Image.ANTIALIAS)\n",
    "    image_data = np.asarray(image).astype(np.float32)\n",
    "    image_data = image_data.transpose([2, 0, 1]) # transpose to CHW\n",
    "    mean = np.array([0.079, 0.05, 0]) + 0.406\n",
    "    std = np.array([0.005, 0, 0.001]) + 0.224\n",
    "    for channel in range(image_data.shape[0]):\n",
    "        image_data[channel, :, :] = (image_data[channel, :, :] / 255 - mean[channel]) / std[channel]\n",
    "    image_data = np.expand_dims(image_data, 0)\n",
    "    return image_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the imagenet labels and load it"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download ImageNet labels\n",
    "!curl -o imagenet_classes.txt https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt\n",
    "\n",
    "# Read the categories\n",
    "with open(\"imagenet_classes.txt\", \"r\") as f:\n",
    "    categories = [s.strip() for s in f.readlines()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Run the example with ONNXRuntime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "session_fp32 = onnxruntime.InferenceSession(\"mobilenet_v2_float.onnx\")\n",
    "\n",
    "def softmax(x):\n",
    "    \"\"\"Compute softmax values for each sets of scores in x.\"\"\"\n",
    "    e_x = np.exp(x - np.max(x))\n",
    "    return e_x / e_x.sum()\n",
    "\n",
    "def run_sample(session, image_file, categories):\n",
    "    output = session.run([], {'input':preprocess_image(image_file, image_height, image_width)})[0]\n",
    "    output = output.flatten()\n",
    "    output = softmax(output) # this is optional\n",
    "    top5_catid = np.argsort(-output)[:5]\n",
    "    for catid in top5_catid:\n",
    "        print(categories[catid], output[catid])\n",
    "\n",
    "run_sample(session_fp32, 'cat.jpg', categories)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2 Quantize the model with ONNXRuntime \n",
    "In this step, we load the full precison model, and quantize it with ONNXRuntime quantization tool. And show the model size comparison between full precision and quantized model. Finally, we run the same sample with the quantized model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1 Implement a CalibrationDataReader\n",
    "CalibrationDataReader takes in calibration data and generates input for the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from onnxruntime.quantization import quantize_static, CalibrationDataReader, QuantType\n",
    "import os\n",
    "\n",
    "def preprocess_func(images_folder, height, width, size_limit=0):\n",
    "    image_names = os.listdir(images_folder)\n",
    "    if size_limit > 0 and len(image_names) >= size_limit:\n",
    "        batch_filenames = [image_names[i] for i in range(size_limit)]\n",
    "    else:\n",
    "        batch_filenames = image_names\n",
    "    unconcatenated_batch_data = []\n",
    "\n",
    "    for image_name in batch_filenames:\n",
    "        image_filepath = images_folder + '/' + image_name\n",
    "        image_data = preprocess_image(image_filepath, height, width)\n",
    "        unconcatenated_batch_data.append(image_data)\n",
    "    batch_data = np.concatenate(np.expand_dims(unconcatenated_batch_data, axis=0), axis=0)\n",
    "    return batch_data\n",
    "\n",
    "\n",
    "class MobilenetDataReader(CalibrationDataReader):\n",
    "    def __init__(self, calibration_image_folder):\n",
    "        self.image_folder = calibration_image_folder\n",
    "        self.preprocess_flag = True\n",
    "        self.enum_data_dicts = []\n",
    "        self.datasize = 0\n",
    "\n",
    "    def get_next(self):\n",
    "        if self.preprocess_flag:\n",
    "            self.preprocess_flag = False\n",
    "            nhwc_data_list = preprocess_func(self.image_folder, image_height, image_width, size_limit=0)\n",
    "            self.datasize = len(nhwc_data_list)\n",
    "            self.enum_data_dicts = iter([{'input': nhwc_data} for nhwc_data in nhwc_data_list])\n",
    "        return next(self.enum_data_dicts, None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2 Quantize the model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can not upload full calibration data set for copy right issue, we only demonstrate with some example images. You need to use your own calibration data set in practice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# change it to your real calibration data set\n",
    "calibration_data_folder = \"calibration_imagenet\"\n",
    "dr = MobilenetDataReader(calibration_data_folder)\n",
    "\n",
    "quantize_static('mobilenet_v2_float.onnx',\n",
    "                'mobilenet_v2_uint8.onnx',\n",
    "                dr)\n",
    "\n",
    "print('ONNX full precision model size (MB):', os.path.getsize(\"mobilenet_v2_float.onnx\")/(1024*1024))\n",
    "print('ONNX quantized model size (MB):', os.path.getsize(\"mobilenet_v2_uint8.onnx\")/(1024*1024))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3 Run the model with OnnxRuntime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "session_quant = onnxruntime.InferenceSession(\"mobilenet_v2_uint8.onnx\")\n",
    "run_sample(session_quant, 'cat.jpg', categories)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
